Overview

Dataset statistics

Number of variables: 28
Number of observations: 52924
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.7 MiB
Average record size in memory: 232.0 B
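
The memory figures above can be cross-checked with simple arithmetic. The sketch below assumes every cell occupies 8 bytes and that the row index adds another 8 bytes per row; that layout is an assumption about the underlying DataFrame, not something the report states.

```python
# Figures copied from the report.
n_rows = 52924
n_columns = 28

# Assumption: 8 bytes per cell plus 8 bytes of index per row.
bytes_per_cell = 8
index_bytes_per_row = 8

record_size = n_columns * bytes_per_cell + index_bytes_per_row
total_mib = n_rows * record_size / 1024**2
column_kib = n_rows * (bytes_per_cell + index_bytes_per_row) / 1024

print(f"record size: {record_size} B")      # 232 B
print(f"total size:  {total_mib:.1f} MiB")   # 11.7 MiB
print(f"per column:  {column_kib:.1f} KiB")  # 826.9 KiB
```

Under these assumptions the results match the reported average record size, total size, and the per-variable "Memory size" shown in the Variables section.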

Variable types

Numeric: 12
DateTime: 2
Categorical: 12
Unsupported: 2

Warnings

year_number has constant value "2019" Constant
Product_SKU has a high cardinality: 1145 distinct values High cardinality
Product_Description has a high cardinality: 404 distinct values High cardinality
Transaction_ID is highly correlated with week_number and 1 other field High correlation
week_number is highly correlated with Transaction_ID and 1 other field High correlation
month_number is highly correlated with Transaction_ID and 1 other field High correlation
Location is highly correlated with year_number High correlation
Product_Category is highly correlated with year_number and 1 other field High correlation
year_number is highly correlated with Location and 8 other fields High correlation
Coupon_Code is highly correlated with year_number and 2 other fields High correlation
User_type is highly correlated with year_number High correlation
GST is highly correlated with Product_Category and 2 other fields High correlation
Coupon_Status is highly correlated with year_number High correlation
Discount_pct is highly correlated with year_number and 1 other field High correlation
Gender is highly correlated with year_number High correlation
revenue_seg is highly correlated with year_number High correlation
revenue is highly skewed (γ1 = 70.05685108) Skewed
Transaction_Date_Month_x is an unsupported type, check if it needs cleaning or further analysis Unsupported
Transaction_Date_Month_y is an unsupported type, check if it needs cleaning or further analysis Unsupported
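
The Constant, High cardinality and Skewed warnings above can be read as simple threshold rules. The following is a minimal sketch, not the profiler's own code: the function name `column_warnings`, the dictionary layout, and both threshold values (50 distinct values, absolute skewness of 20) are assumptions chosen to be consistent with the figures in this report; the actual values live in the report's configuration.

```python
# Assumed thresholds; the report's config.yaml may use different values.
CARDINALITY_THRESHOLD = 50
SKEWNESS_THRESHOLD = 20.0


def column_warnings(summary):
    """Return warning labels for one column summary (a plain dict)."""
    found = []
    distinct = summary.get("distinct")
    if distinct == 1:
        found.append("Constant")
    elif (summary.get("categorical") and distinct is not None
          and distinct > CARDINALITY_THRESHOLD):
        found.append("High cardinality")
    skewness = summary.get("skewness")
    if skewness is not None and abs(skewness) > SKEWNESS_THRESHOLD:
        found.append("Skewed")
    return found


# Distinct counts and skewness values copied from this report.
summaries = {
    "year_number": {"categorical": True, "distinct": 1},
    "Product_SKU": {"categorical": True, "distinct": 1145},
    "Coupon_Code": {"categorical": True, "distinct": 46},
    "revenue": {"categorical": False, "skewness": 70.05685108},
    "Quantity": {"categorical": False, "distinct": 151, "skewness": 19.03480155},
}

flags = {name: column_warnings(s) for name, s in summaries.items()}
for name, labels in flags.items():
    print(name, labels)
```

With these assumed thresholds, Coupon_Code (46 distinct values) and Quantity (skewness about 19.03) are not flagged, which agrees with the warning list.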

Reproduction

Analysis started: 2022-01-08 16:01:19.945592
Analysis finished: 2022-01-08 16:04:25.316616
Duration: 3 minutes and 5.37 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
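
As a quick sanity check, the reported duration can be recomputed from the two timestamps. This sketch uses only the Python standard library and the values printed above.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2022-01-08 16:01:19.945592")
finished = datetime.fromisoformat("2022-01-08 16:04:25.316616")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
# prints "3 minutes and 5.37 seconds"
```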

Variables

CustomerID
Real number (ℝ≥0)

Distinct: 1468
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15346.70981
Minimum: 12346
Maximum: 18283
Zeros: 0
Zeros (%): 0.0%
Memory size: 826.9 KiB

Quantile statistics

Minimum: 12346
5th percentile: 12585
Q1: 13869
Median: 15311
Q3: 16996.25
95th percentile: 17967
Maximum: 18283
Range: 5937
Interquartile range (IQR): 3127.25

Descriptive statistics

Standard deviation: 1766.55602
Coefficient of variation (CV): 0.1151097559
Kurtosis: -1.234304871
Mean: 15346.70981
Median Absolute Deviation (MAD): 1569
Skewness: -0.03263921818
Sum: 812209270
Variance: 3120720.173
Monotonicity: Not monotonic
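
Several of the statistics above are derived from others, so they can be cross-checked directly. A minimal sketch using the CustomerID figures from this report (the variable names are illustrative):

```python
import math

# CustomerID figures copied from the report.
minimum, maximum = 12346, 18283
q1, q3 = 13869, 16996.25
mean, std = 15346.70981, 1766.55602

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation

print(value_range)         # 5937
print(iqr)                 # 3127.25
print(round(cv, 6))        # 0.11511
print(round(variance, 1))
```

The computed values agree with the reported Range, IQR, CV and Variance up to the rounding of the printed inputs.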
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
12748                695    1.3%
15311                587    1.1%
14606                575    1.1%
17841                572    1.1%
14911                523    1.0%
13089                366    0.7%
15039                315    0.6%
17850                297    0.6%
14646                290    0.5%
13081                261    0.5%
Other values (1458)  48443  91.5%

Value  Count  Frequency (%)
12346  2      < 0.1%
12347  60     0.1%
12348  23     < 0.1%
12350  17     < 0.1%
12356  36     0.1%

Value  Count  Frequency (%)
18283  102    0.2%
18277  1      < 0.1%
18269  8      < 0.1%
18260  40     0.1%
18259  7      < 0.1%

Transaction_ID
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 25061
Distinct (%): 47.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32409.82567
Minimum: 16679
Maximum: 48497
Zeros: 0
Zeros (%): 0.0%
Memory size: 826.9 KiB

Quantile statistics

Minimum: 16679
5th percentile: 18487
Q1: 25384
Median: 32625.5
Q3: 39126.25
95th percentile: 46553.7
Maximum: 48497
Range: 31818
Interquartile range (IQR): 13742.25

Descriptive statistics

Standard deviation: 8648.668977
Coefficient of variation (CV): 0.2668533013
Kurtosis: -1.025369056
Mean: 32409.82567
Median Absolute Deviation (MAD): 6872
Skewness: 0.005581522873
Sum: 1715257614
Variance: 74799475.07
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
32526                 35     0.1%
22958                 30     0.1%
40807                 29     0.1%
34094                 28     0.1%
38059                 27     0.1%
25854                 26     < 0.1%
41853                 26     < 0.1%
24820                 24     < 0.1%
33555                 23     < 0.1%
33228                 23     < 0.1%
Other values (25051)  52653  99.5%

Value  Count  Frequency (%)
16679  1      < 0.1%
16680  1      < 0.1%
16681  1      < 0.1%
16682  10     < 0.1%
16684  2      < 0.1%

Value  Count  Frequency (%)
48497  1      < 0.1%
48496  1      < 0.1%
48495  1      < 0.1%
48494  1      < 0.1%
48493  1      < 0.1%

Distinct: 365
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB
Minimum: 2019-01-01 00:00:00
Maximum: 2019-12-31 00:00:00
Histogram with fixed size bins (bins=50)

Product_SKU
Categorical

HIGH CARDINALITY

Distinct: 1145
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB
GGOENEBJ079499
 
3511
GGOENEBQ078999
 
3328
GGOENEBB078899
 
3230
GGOENEBQ079099
 
1361
GGOENEBQ084699
 
1089
Other values (1140)
40405 

Length

Max length: 14
Median length: 14
Mean length: 13.99981105
Min length: 12

Characters and Unicode

Total characters: 740926
Distinct characters: 34
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
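
To illustrate how such per-character properties can be tallied, here is a minimal sketch using Python's standard `unicodedata` module. The helper `category_counts` and the `CATEGORY_NAMES` mapping are illustrative assumptions, not the profiler's implementation; `unicodedata.category` returns two-letter codes, which are mapped here to the long names used in this report.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the long names used in the report.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Three SKU values taken from the sample rows of this variable.
sample = ["GGOENEBJ079499", "GGOENEBQ078999", "GGOENEBQ079099"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

For these three SKUs the tally is 24 uppercase letters and 18 decimal digits, the same two categories the report lists for the full column.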

Unique

Unique: 90
Unique (%): 0.2%

Sample

1st row: GGOENEBJ079499
2nd row: GGOENEBJ079499
3rd row: GGOENEBQ078999
4th row: GGOENEBQ079099
5th row: GGOENEBJ079499
Value                Count  Frequency (%)
GGOENEBJ079499       3511   6.6%
GGOENEBQ078999       3328   6.3%
GGOENEBB078899       3230   6.1%
GGOENEBQ079099       1361   2.6%
GGOENEBQ084699       1089   2.1%
GGOENEBQ079199       1065   2.0%
GGOENEBQ086799       844    1.6%
GGOEGFKQ020399       806    1.5%
GGOENEBQ086499       599    1.1%
GGOEGDHC018299       583    1.1%
Other values (1135)  36508  69.0%
Histogram of lengths of the category
Value                Count  Frequency (%)
ggoenebj079499       3511   6.6%
ggoenebq078999       3328   6.3%
ggoenebb078899       3230   6.1%
ggoenebq079099       1361   2.6%
ggoenebq084699       1089   2.1%
ggoenebq079199       1065   2.0%
ggoenebq086799       844    1.6%
ggoegfkq020399       806    1.5%
ggoenebq086499       599    1.1%
ggoegdhc018299       583    1.1%
Other values (1135)  36508  69.0%

Most occurring characters

ValueCountFrequency (%)
G137525
18.6%
986049
11.6%
E73488
9.9%
066396
 
9.0%
O57949
 
7.8%
135309
 
4.8%
A32219
 
4.3%
B30198
 
4.1%
725140
 
3.4%
823920
 
3.2%
Other values (24)172733
23.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter423376
57.1%
Decimal Number317550
42.9%

Most frequent character per category

ValueCountFrequency (%)
G137525
32.5%
E73488
17.4%
O57949
13.7%
A32219
 
7.6%
B30198
 
7.1%
N16666
 
3.9%
Q15875
 
3.7%
J11340
 
2.7%
H8050
 
1.9%
C7226
 
1.7%
Other values (14)32840
 
7.8%
ValueCountFrequency (%)
986049
27.1%
066396
20.9%
135309
11.1%
725140
 
7.9%
823920
 
7.5%
319698
 
6.2%
417305
 
5.4%
615065
 
4.7%
214669
 
4.6%
513999
 
4.4%

Most occurring scripts

ValueCountFrequency (%)
Latin423376
57.1%
Common317550
42.9%

Most frequent character per script

ValueCountFrequency (%)
G137525
32.5%
E73488
17.4%
O57949
13.7%
A32219
 
7.6%
B30198
 
7.1%
N16666
 
3.9%
Q15875
 
3.7%
J11340
 
2.7%
H8050
 
1.9%
C7226
 
1.7%
Other values (14)32840
 
7.8%
ValueCountFrequency (%)
986049
27.1%
066396
20.9%
135309
11.1%
725140
 
7.9%
823920
 
7.5%
319698
 
6.2%
417305
 
5.4%
615065
 
4.7%
214669
 
4.6%
513999
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII740926
100.0%

Most frequent character per block

ValueCountFrequency (%)
G137525
18.6%
986049
11.6%
E73488
9.9%
066396
 
9.0%
O57949
 
7.8%
135309
 
4.8%
A32219
 
4.3%
B30198
 
4.1%
725140
 
3.4%
823920
 
3.2%
Other values (24)172733
23.3%

Product_Description
Categorical

HIGH CARDINALITY

Distinct: 404
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB
Nest Learning Thermostat 3rd Gen-USA - Stainless Steel
 
3511
Nest Cam Outdoor Security Camera - USA
 
3328
Nest Cam Indoor Security Camera - USA
 
3230
Google Sunglasses
 
1523
Nest Protect Smoke + CO White Battery Alarm-USA
 
1361
Other values (399)
39971 

Length

Max length: 59
Median length: 37
Mean length: 34.16423551
Min length: 8

Characters and Unicode

Total characters: 1808108
Distinct characters: 74
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: Nest Learning Thermostat 3rd Gen-USA - Stainless Steel
2nd row: Nest Learning Thermostat 3rd Gen-USA - Stainless Steel
3rd row: Nest Cam Outdoor Security Camera - USA
4th row: Nest Protect Smoke + CO White Battery Alarm-USA
5th row: Nest Learning Thermostat 3rd Gen-USA - Stainless Steel
ValueCountFrequency (%)
Nest Learning Thermostat 3rd Gen-USA - Stainless Steel3511
 
6.6%
Nest Cam Outdoor Security Camera - USA3328
 
6.3%
Nest Cam Indoor Security Camera - USA3230
 
6.1%
Google Sunglasses1523
 
2.9%
Nest Protect Smoke + CO White Battery Alarm-USA1361
 
2.6%
Nest Learning Thermostat 3rd Gen-USA - White1089
 
2.1%
Nest Protect Smoke + CO White Wired Alarm-USA1065
 
2.0%
Google 22 oz Water Bottle902
 
1.7%
Nest Thermostat E - USA844
 
1.6%
Google Laptop and Cell Phone Stickers806
 
1.5%
Other values (394)35265
66.6%
Histogram of lengths of the category
ValueCountFrequency (%)
google21790
 
7.2%
17529
 
5.8%
nest16528
 
5.5%
tee11533
 
3.8%
men's9080
 
3.0%
usa8742
 
2.9%
sleeve7964
 
2.6%
cam7448
 
2.5%
short7241
 
2.4%
security6697
 
2.2%
Other values (392)186249
61.9%

Most occurring characters

ValueCountFrequency (%)
248342
 
13.7%
e228054
 
12.6%
o131832
 
7.3%
t107130
 
5.9%
a91639
 
5.1%
r90077
 
5.0%
l81603
 
4.5%
n67451
 
3.7%
S62355
 
3.4%
s58792
 
3.3%
Other values (64)640833
35.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1177477
65.1%
Uppercase Letter318134
 
17.6%
Space Separator248342
 
13.7%
Dash Punctuation23080
 
1.3%
Decimal Number19806
 
1.1%
Other Punctuation18283
 
1.0%
Math Symbol2531
 
0.1%
Currency Symbol159
 
< 0.1%
Open Punctuation148
 
< 0.1%
Close Punctuation148
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
S62355
19.6%
G30771
9.7%
C27982
8.8%
T24893
 
7.8%
A23994
 
7.5%
N19477
 
6.1%
B18430
 
5.8%
U16517
 
5.2%
W14268
 
4.5%
M12393
 
3.9%
Other values (16)67054
21.1%
ValueCountFrequency (%)
e228054
19.4%
o131832
11.2%
t107130
9.1%
a91639
 
7.8%
r90077
 
7.7%
l81603
 
6.9%
n67451
 
5.7%
s58792
 
5.0%
i48194
 
4.1%
g41246
 
3.5%
Other values (16)231459
19.7%
ValueCountFrequency (%)
35401
27.3%
04048
20.4%
13489
17.6%
23443
17.4%
41260
 
6.4%
5926
 
4.7%
7547
 
2.8%
6408
 
2.1%
8258
 
1.3%
926
 
0.1%
ValueCountFrequency (%)
'13469
73.7%
/1932
 
10.6%
%1726
 
9.4%
&853
 
4.7%
.159
 
0.9%
;144
 
0.8%
ValueCountFrequency (%)
248342
100.0%
ValueCountFrequency (%)
-23080
100.0%
ValueCountFrequency (%)
+2531
100.0%
ValueCountFrequency (%)
(148
100.0%
ValueCountFrequency (%)
)148
100.0%
ValueCountFrequency (%)
$159
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1495611
82.7%
Common312497
 
17.3%

Most frequent character per script

ValueCountFrequency (%)
e228054
15.2%
o131832
 
8.8%
t107130
 
7.2%
a91639
 
6.1%
r90077
 
6.0%
l81603
 
5.5%
n67451
 
4.5%
S62355
 
4.2%
s58792
 
3.9%
i48194
 
3.2%
Other values (42)528484
35.3%
ValueCountFrequency (%)
248342
79.5%
-23080
 
7.4%
'13469
 
4.3%
35401
 
1.7%
04048
 
1.3%
13489
 
1.1%
23443
 
1.1%
+2531
 
0.8%
/1932
 
0.6%
%1726
 
0.6%
Other values (12)5036
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII1808108
100.0%

Most frequent character per block

ValueCountFrequency (%)
248342
 
13.7%
e228054
 
12.6%
o131832
 
7.3%
t107130
 
5.9%
a91639
 
5.1%
r90077
 
5.0%
l81603
 
4.5%
n67451
 
3.7%
S62355
 
3.4%
s58792
 
3.3%
Other values (64)640833
35.4%

Product_Category
Categorical

HIGH CORRELATION

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB
Apparel
18126 
Nest-USA
14013 
Office
6513 
Drinkware
3483 
Lifestyle
3092 
Other values (15)
7697 

Length

Max length: 20
Median length: 7
Mean length: 7.374650442
Min length: 3

Characters and Unicode

Total characters: 390296
Distinct characters: 38
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Nest-USA
2nd row: Nest-USA
3rd row: Nest-USA
4th row: Nest-USA
5th row: Nest-USA
ValueCountFrequency (%)
Apparel18126
34.2%
Nest-USA14013
26.5%
Office6513
 
12.3%
Drinkware3483
 
6.6%
Lifestyle3092
 
5.8%
Nest2198
 
4.2%
Bags1882
 
3.6%
Headgear771
 
1.5%
Notebooks & Journals749
 
1.4%
Waze554
 
1.0%
Other values (10)1543
 
2.9%
Histogram of lengths of the category
ValueCountFrequency (%)
apparel18126
33.2%
nest-usa14013
25.7%
office6513
 
11.9%
drinkware3483
 
6.4%
lifestyle3092
 
5.7%
nest2198
 
4.0%
bags1928
 
3.5%
headgear771
 
1.4%
749
 
1.4%
journals749
 
1.4%
Other values (13)3005
 
5.5%

Most occurring characters

ValueCountFrequency (%)
e54810
14.0%
p36341
 
9.3%
A32416
 
8.3%
a27792
 
7.1%
r27216
 
7.0%
s24508
 
6.3%
l22340
 
5.7%
t21064
 
5.4%
N17277
 
4.4%
f16277
 
4.2%
Other values (28)110255
28.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter277280
71.0%
Uppercase Letter96234
 
24.7%
Dash Punctuation14330
 
3.7%
Space Separator1703
 
0.4%
Other Punctuation749
 
0.2%

Most frequent character per category

ValueCountFrequency (%)
e54810
19.8%
p36341
13.1%
a27792
10.0%
r27216
9.8%
s24508
8.8%
l22340
8.1%
t21064
 
7.6%
f16277
 
5.9%
i13524
 
4.9%
c7159
 
2.6%
Other values (10)26249
9.5%
ValueCountFrequency (%)
A32416
33.7%
N17277
18.0%
U14013
14.6%
S14013
14.6%
O6513
 
6.8%
D3483
 
3.6%
L3092
 
3.2%
B2285
 
2.4%
H893
 
0.9%
J749
 
0.8%
Other values (5)1500
 
1.6%
ValueCountFrequency (%)
-14330
100.0%
ValueCountFrequency (%)
1703
100.0%
ValueCountFrequency (%)
&749
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin373514
95.7%
Common16782
 
4.3%

Most frequent character per script

ValueCountFrequency (%)
e54810
14.7%
p36341
 
9.7%
A32416
 
8.7%
a27792
 
7.4%
r27216
 
7.3%
s24508
 
6.6%
l22340
 
6.0%
t21064
 
5.6%
N17277
 
4.6%
f16277
 
4.4%
Other values (25)93473
25.0%
ValueCountFrequency (%)
-14330
85.4%
1703
 
10.1%
&749
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII390296
100.0%

Most frequent character per block

ValueCountFrequency (%)
e54810
14.0%
p36341
 
9.3%
A32416
 
8.3%
a27792
 
7.1%
r27216
 
7.0%
s24508
 
6.3%
l22340
 
5.7%
t21064
 
5.4%
N17277
 
4.4%
f16277
 
4.2%
Other values (28)110255
28.2%

Quantity
Real number (ℝ≥0)

Distinct: 151
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.497638123
Minimum: 1
Maximum: 900
Zeros: 0
Zeros (%): 0.0%
Memory size: 826.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 2
95th percentile: 16
Maximum: 900
Range: 899
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 20.10471082
Coefficient of variation (CV): 4.470059679
Kurtosis: 525.4524846
Mean: 4.497638123
Median Absolute Deviation (MAD): 0
Skewness: 19.03480155
Sum: 238033
Variance: 404.1993973
Monotonicity: Not monotonic
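
The combination of a MAD of 0 and a large positive skewness is typical of a column where most values are identical and a few are very large. The sketch below reproduces that pattern on a small made-up list; the data are hypothetical, and the skewness estimator shown (the bias-adjusted Fisher-Pearson coefficient) is an assumption about which estimator the report uses.

```python
import math
import statistics


def sample_skewness(xs):
    """Bias-adjusted Fisher-Pearson skewness coefficient (requires n >= 3)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    return math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5


def median_absolute_deviation(xs):
    """Median of the absolute deviations from the median."""
    med = statistics.median(xs)
    return statistics.median(abs(x - med) for x in xs)


# Hypothetical quantities: mostly 1, with a few large orders.
quantities = [1, 1, 1, 1, 1, 1, 1, 2, 5, 50]

print(median_absolute_deviation(quantities))  # 0
print(sample_skewness(quantities) > 0)        # True
```

Because more than half of the values equal the median, the MAD collapses to 0, while the long right tail drives the skewness positive.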
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   35336  66.8%
2                   7016   13.3%
3                   2288   4.3%
5                   1734   3.3%
4                   1237   2.3%
10                  1035   2.0%
20                  531    1.0%
6                   435    0.8%
15                  389    0.7%
25                  287    0.5%
Other values (141)  2636   5.0%

Value  Count  Frequency (%)
1      35336  66.8%
2      7016   13.3%
3      2288   4.3%
4      1237   2.3%
5      1734   3.3%

Value  Count  Frequency (%)
900    1      < 0.1%
825    2      < 0.1%
791    1      < 0.1%
750    1      < 0.1%
600    5      < 0.1%

Avg_Price
Real number (ℝ≥0)

Distinct: 546
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.23764644
Minimum: 0.39
Maximum: 355.74
Zeros: 0
Zeros (%): 0.0%
Memory size: 826.9 KiB

Quantile statistics

Minimum: 0.39
5th percentile: 1.99
Q1: 5.7
Median: 16.99
Q3: 102.13
95th percentile: 151.88
Maximum: 355.74
Range: 355.35
Interquartile range (IQR): 96.43

Descriptive statistics

Standard deviation: 64.0068816
Coefficient of variation (CV): 1.225301788
Kurtosis: 3.342401681
Mean: 52.23764644
Median Absolute Deviation (MAD): 14.19
Skewness: 1.632579807
Sum: 2764625.2
Variance: 4096.880892
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
119                 5127   9.7%
149                 3824   7.2%
79                  1937   3.7%
13.59               1530   2.9%
2.39                1284   2.4%
2.99                1226   2.3%
16.99               1036   2.0%
15.19               984    1.9%
1.99                956    1.8%
3.99                949    1.8%
Other values (536)  34071  64.4%

Value  Count  Frequency (%)
0.39   1      < 0.1%
0.4    45     0.1%
0.41   15     < 0.1%
0.5    33     0.1%
0.51   27     0.1%

Value   Count  Frequency (%)
355.74  169    0.3%
349     330    0.6%
279     147    0.3%
274.19  3      < 0.1%
269     1      < 0.1%

Delivery_Charges
Real number (ℝ≥0)

Distinct: 267
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.51763038
Minimum: 0
Maximum: 521.36
Zeros: 162
Zeros (%): 0.3%
Memory size: 826.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 6
Q1: 6
Median: 6
Q3: 6.5
95th percentile: 26.43
Maximum: 521.36
Range: 521.36
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 19.47561323
Coefficient of variation (CV): 1.85171113
Kurtosis: 204.6376899
Mean: 10.51763038
Median Absolute Deviation (MAD): 0
Skewness: 11.95973935
Sum: 556635.07
Variance: 379.2995108
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
6                   26801  50.6%
6.5                 15819  29.9%
12.99               2532   4.8%
19.99               1042   2.0%
12.48               798    1.5%
12.91               454    0.9%
8.7                 325    0.6%
0                   162    0.3%
18.47               139    0.3%
13.38               111    0.2%
Other values (257)  4741   9.0%

Value  Count  Frequency (%)
0      162    0.3%
6      26801  50.6%
6.46   14     < 0.1%
6.48   29     0.1%
6.5    15819  29.9%

Value   Count  Frequency (%)
521.36  1      < 0.1%
504     2      < 0.1%
492.84  10     < 0.1%
422.24  4      < 0.1%
354     3      < 0.1%

Coupon_Status
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB
Clicked
26926 
Used
17904 
Not Used
8094 

Length

Max length: 8
Median length: 7
Mean length: 6.138047011
Min length: 4

Characters and Unicode

Total characters: 324850
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Used
2nd row: Used
3rd row: Not Used
4th row: Clicked
5th row: Clicked
Value     Count  Frequency (%)
Clicked   26926  50.9%
Used      17904  33.8%
Not Used  8094   15.3%
Histogram of lengths of the category
Value    Count  Frequency (%)
clicked  26926  44.1%
used     25998  42.6%
not      8094   13.3%
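
The lowercase word counts appear to be derived from the value counts by splitting each category on whitespace. A minimal sketch of that derivation, using the Coupon_Status counts from this report (the tokenisation rule, lowercasing plus whitespace split, is an assumption):

```python
from collections import Counter

# Value counts copied from the Coupon_Status table.
value_counts = Counter({"Clicked": 26926, "Used": 17904, "Not Used": 8094})
total_rows = sum(value_counts.values())

# Each word inherits the count of every category it occurs in.
word_counts = Counter()
for value, n in value_counts.items():
    for word in value.lower().split():
        word_counts[word] += n
total_words = sum(word_counts.values())

for value, n in value_counts.most_common():
    print(f"{value:<10}{n:>7}{100 * n / total_rows:>7.1f}%")
for word, n in word_counts.most_common():
    print(f"{word:<10}{n:>7}{100 * n / total_words:>7.1f}%")
```

Here "used" collects the rows of both "Used" and "Not Used" (17904 + 8094 = 25998), which is why its share among words differs from the share of "Used" among values.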

Most occurring characters

ValueCountFrequency (%)
e52924
16.3%
d52924
16.3%
C26926
8.3%
l26926
8.3%
i26926
8.3%
c26926
8.3%
k26926
8.3%
U25998
8.0%
s25998
8.0%
N8094
 
2.5%
Other values (3)24282
7.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter255738
78.7%
Uppercase Letter61018
 
18.8%
Space Separator8094
 
2.5%

Most frequent character per category

ValueCountFrequency (%)
e52924
20.7%
d52924
20.7%
l26926
10.5%
i26926
10.5%
c26926
10.5%
k26926
10.5%
s25998
10.2%
o8094
 
3.2%
t8094
 
3.2%
ValueCountFrequency (%)
C26926
44.1%
U25998
42.6%
N8094
 
13.3%
ValueCountFrequency (%)
8094
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin316756
97.5%
Common8094
 
2.5%

Most frequent character per script

ValueCountFrequency (%)
e52924
16.7%
d52924
16.7%
C26926
8.5%
l26926
8.5%
i26926
8.5%
c26926
8.5%
k26926
8.5%
U25998
8.2%
s25998
8.2%
N8094
 
2.6%
Other values (2)16188
 
5.1%
ValueCountFrequency (%)
8094
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII324850
100.0%

Most frequent character per block

ValueCountFrequency (%)
e52924
16.3%
d52924
16.3%
C26926
8.3%
l26926
8.3%
i26926
8.3%
c26926
8.3%
k26926
8.3%
U25998
8.0%
s25998
8.0%
N8094
 
2.5%
Other values (3)24282
7.5%

Gender
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB
F
33007 
M
19917 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 52924
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: M
3rd row: M
4th row: M
5th row: M
ValueCountFrequency (%)
F33007
62.4%
M19917
37.6%
Histogram of lengths of the category
ValueCountFrequency (%)
f33007
62.4%
m19917
37.6%

Most occurring characters

ValueCountFrequency (%)
F33007
62.4%
M19917
37.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter52924
100.0%

Most frequent character per category

ValueCountFrequency (%)
F33007
62.4%
M19917
37.6%

Most occurring scripts

ValueCountFrequency (%)
Latin52924
100.0%

Most frequent character per script

ValueCountFrequency (%)
F33007
62.4%
M19917
37.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII52924
100.0%

Most frequent character per block

ValueCountFrequency (%)
F33007
62.4%
M19917
37.6%

Location
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB
Chicago
18380 
California
16136 
New York
11173 
New Jersey
4503 
Washington DC
2732 

Length

Max length: 13
Median length: 8
Mean length: 8.690764115
Min length: 7

Characters and Unicode

Total characters: 459950
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Chicago
2nd row: Chicago
3rd row: Chicago
4th row: Chicago
5th row: Chicago
ValueCountFrequency (%)
Chicago18380
34.7%
California16136
30.5%
New York11173
21.1%
New Jersey4503
 
8.5%
Washington DC2732
 
5.2%
Histogram of lengths of the category
ValueCountFrequency (%)
chicago18380
25.8%
california16136
22.6%
new15676
22.0%
york11173
15.7%
jersey4503
 
6.3%
dc2732
 
3.8%
washington2732
 
3.8%

Most occurring characters

ValueCountFrequency (%)
i53384
11.6%
a53384
11.6%
o48421
 
10.5%
C37248
 
8.1%
r31812
 
6.9%
e24682
 
5.4%
n21600
 
4.7%
h21112
 
4.6%
g21112
 
4.6%
18408
 
4.0%
Other values (13)128787
28.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter367478
79.9%
Uppercase Letter74064
 
16.1%
Space Separator18408
 
4.0%

Most frequent character per category

ValueCountFrequency (%)
i53384
14.5%
a53384
14.5%
o48421
13.2%
r31812
8.7%
e24682
6.7%
n21600
 
5.9%
h21112
 
5.7%
g21112
 
5.7%
c18380
 
5.0%
l16136
 
4.4%
Other values (6)57455
15.6%
ValueCountFrequency (%)
C37248
50.3%
N15676
21.2%
Y11173
 
15.1%
J4503
 
6.1%
W2732
 
3.7%
D2732
 
3.7%
ValueCountFrequency (%)
18408
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin441542
96.0%
Common18408
 
4.0%

Most frequent character per script

ValueCountFrequency (%)
i53384
12.1%
a53384
12.1%
o48421
11.0%
C37248
 
8.4%
r31812
 
7.2%
e24682
 
5.6%
n21600
 
4.9%
h21112
 
4.8%
g21112
 
4.8%
c18380
 
4.2%
Other values (12)110407
25.0%
ValueCountFrequency (%)
18408
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII459950
100.0%

Most frequent character per block

ValueCountFrequency (%)
i53384
11.6%
a53384
11.6%
o48421
 
10.5%
C37248
 
8.1%
r31812
 
6.9%
e24682
 
5.4%
n21600
 
4.7%
h21112
 
4.6%
g21112
 
4.6%
18408
 
4.0%
Other values (13)128787
28.0%

Tenure_Months
Real number (ℝ≥0)

Distinct: 49
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.12799486
Minimum: 2
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Memory size: 826.9 KiB

Quantile statistics

Minimum: 2
5th percentile: 5
Q1: 15
Median: 27
Q3: 37
95th percentile: 47
Maximum: 50
Range: 48
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 13.47828519
Coefficient of variation (CV): 0.5158560872
Kurtosis: -1.117171643
Mean: 26.12799486
Median Absolute Deviation (MAD): 11
Skewness: -0.06955528023
Sum: 1382798
Variance: 181.6641718
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=49)
Value              Count  Frequency (%)
40                 2043   3.9%
25                 1853   3.5%
34                 1670   3.2%
30                 1656   3.1%
33                 1648   3.1%
21                 1590   3.0%
5                  1543   2.9%
45                 1469   2.8%
10                 1448   2.7%
28                 1417   2.7%
Other values (39)  36587  69.1%

Value  Count  Frequency (%)
2      649    1.2%
3      625    1.2%
4      1055   2.0%
5      1543   2.9%
6      1296   2.4%

Value  Count  Frequency (%)
50     737    1.4%
49     841    1.6%
48     884    1.7%
47     497    0.9%
46     756    1.4%

GST
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB
0.18
27343 
0.1
21314 
0.05
4145 
0.12
 
122

Length

Max length: 4
Median length: 4
Mean length: 3.597271559
Min length: 3

Characters and Unicode

Total characters: 190382
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.1
2nd row: 0.1
3rd row: 0.1
4th row: 0.1
5th row: 0.1
Value  Count  Frequency (%)
0.18   27343  51.7%
0.1    21314  40.3%
0.05   4145   7.8%
0.12   122    0.2%
Histogram of lengths of the category
Value  Count  Frequency (%)
0.18   27343  51.7%
0.1    21314  40.3%
0.05   4145   7.8%
0.12   122    0.2%

Most occurring characters

ValueCountFrequency (%)
057069
30.0%
.52924
27.8%
148779
25.6%
827343
14.4%
54145
 
2.2%
2122
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number137458
72.2%
Other Punctuation52924
 
27.8%

Most frequent character per category

ValueCountFrequency (%)
057069
41.5%
148779
35.5%
827343
19.9%
54145
 
3.0%
2122
 
0.1%
ValueCountFrequency (%)
.52924
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common190382
100.0%

Most frequent character per script

ValueCountFrequency (%)
057069
30.0%
.52924
27.8%
148779
25.6%
827343
14.4%
54145
 
2.2%
2122
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII190382
100.0%

Most frequent character per block

ValueCountFrequency (%)
057069
30.0%
.52924
27.8%
148779
25.6%
827343
14.4%
54145
 
2.2%
2122
 
0.1%

Transaction_Date_Month_x
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB

Transaction_Date_Month_y
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB

User_type
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB
New
32033 
Existing
20891 

Length

Max length: 8
Median length: 3
Mean length: 4.973679238
Min length: 3

Characters and Unicode

Total characters: 263227
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: New
2nd row: New
3rd row: New
4th row: New
5th row: New
ValueCountFrequency (%)
New32033
60.5%
Existing20891
39.5%
Histogram of lengths of the category
ValueCountFrequency (%)
new32033
60.5%
existing20891
39.5%

Most occurring characters

ValueCountFrequency (%)
i41782
15.9%
N32033
12.2%
e32033
12.2%
w32033
12.2%
E20891
7.9%
x20891
7.9%
s20891
7.9%
t20891
7.9%
n20891
7.9%
g20891
7.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter210303
79.9%
Uppercase Letter52924
 
20.1%

Most frequent character per category

ValueCountFrequency (%)
i41782
19.9%
e32033
15.2%
w32033
15.2%
x20891
9.9%
s20891
9.9%
t20891
9.9%
n20891
9.9%
g20891
9.9%
ValueCountFrequency (%)
N32033
60.5%
E20891
39.5%

Most occurring scripts

ValueCountFrequency (%)
Latin263227
100.0%

Most frequent character per script

ValueCountFrequency (%)
i41782
15.9%
N32033
12.2%
e32033
12.2%
w32033
12.2%
E20891
7.9%
x20891
7.9%
s20891
7.9%
t20891
7.9%
n20891
7.9%
g20891
7.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII263227
100.0%

Most frequent character per block

ValueCountFrequency (%)
i41782
15.9%
N32033
12.2%
e32033
12.2%
w32033
12.2%
E20891
7.9%
x20891
7.9%
s20891
7.9%
t20891
7.9%
n20891
7.9%
g20891
7.9%

year_number
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB
2019
52924 

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 211696
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2019
4th row: 2019
5th row: 2019
Value  Count  Frequency (%)
2019   52924  100.0%
Histogram of lengths of the category
Value  Count  Frequency (%)
2019   52924  100.0%

Most occurring characters

ValueCountFrequency (%)
252924
25.0%
052924
25.0%
152924
25.0%
952924
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number211696
100.0%

Most frequent character per category

ValueCountFrequency (%)
252924
25.0%
052924
25.0%
152924
25.0%
952924
25.0%

Most occurring scripts

ValueCountFrequency (%)
Common211696
100.0%

Most frequent character per script

ValueCountFrequency (%)
252924
25.0%
052924
25.0%
152924
25.0%
952924
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII211696
100.0%

Most frequent character per block

ValueCountFrequency (%)
252924
25.0%
052924
25.0%
152924
25.0%
952924
25.0%

week_number
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 52
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.07867886
Minimum: 1
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Memory size: 826.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 3
Q1: 15
Median: 28
Q3: 39
95th percentile: 50
Maximum: 52
Range: 51
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 14.53683414
Coefficient of variation (CV): 0.5368369047
Kurtosis: -1.09793299
Mean: 27.07867886
Median Absolute Deviation (MAD): 12
Skewness: -0.06662059991
Sum: 1433112
Variance: 211.3195469
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value              Count  Frequency (%)
31                 1524   2.9%
29                 1405   2.7%
32                 1398   2.6%
33                 1344   2.5%
35                 1322   2.5%
28                 1305   2.5%
34                 1304   2.5%
50                 1258   2.4%
49                 1256   2.4%
30                 1238   2.3%
Other values (42)  39570  74.8%

Smallest 5 values

Value  Count  Frequency (%)
1      1056   2.0%
2      829    1.6%
3      842    1.6%
4      943    1.8%
5      926    1.7%

Largest 5 values

Value  Count  Frequency (%)
52     512    1.0%
51     1215   2.3%
50     1258   2.4%
49     1256   2.4%
48     1111   2.1%
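
Several of the descriptive statistics reported for week_number follow directly from others, which allows a quick consistency check. The sketch below recomputes them from the figures printed in this report; the variable names are illustrative and not part of the report.

```python
# Figures copied from the week_number summary in this report.
mean = 27.07867886
std = 14.53683414
q1, q3 = 15, 39
minimum, maximum = 1, 52

cv = std / mean                  # coefficient of variation = std / mean
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range = Q3 - Q1
value_range = maximum - minimum  # range = max - min

print(round(cv, 6), round(variance, 4), iqr, value_range)
```

Up to rounding, these values should agree with the CV, variance, IQR and range rows of the report.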

month_number
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.65238833
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 826.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
median: 7
Q3: 9
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.333363532
Coefficient of variation (CV): 0.5010777132
Kurtosis: -1.091467617
Mean: 6.65238833
Median Absolute Deviation (MAD): 3
Skewness: -0.06860865597
Sum: 352071
Variance: 11.11131244
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Most frequent values

Value             Count  Frequency (%)
8                 6150   11.6%
7                 5251   9.9%
5                 4572   8.6%
12                4502   8.5%
3                 4346   8.2%
9                 4288   8.1%
6                 4193   7.9%
10                4164   7.9%
4                 4150   7.8%
1                 4063   7.7%
Other values (2)  7245   13.7%

Smallest 5 values

Value  Count  Frequency (%)
1      4063   7.7%
2      3284   6.2%
3      4346   8.2%
4      4150   7.8%
5      4572   8.6%

Largest 5 values

Value  Count  Frequency (%)
12     4502   8.5%
11     3961   7.5%
10     4164   7.9%
9      4288   8.1%
8      6150   11.6%

Coupon_Code
Categorical

HIGH CORRELATION

Distinct: 46
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB

Value              Count
SALE20             6373
SALE30             5915
SALE10             5838
ELEC10             4826
ELEC30             4647
Other values (41)  25325

Length

Max length: 9
Median length: 6
Mean length: 5.861820724
Min length: 4

Characters and Unicode

Total characters: 310231
Distinct characters: 30
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
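
As a rough illustration of how the category counts in this section could be derived, the sketch below tallies Unicode general categories with Python's standard unicodedata module. The helper name category_counts is hypothetical, and "No Coupon" is an assumed example value; only "ELEC10" and "SALE20" are taken from this report.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# "ELEC10" and "SALE20" appear in this report; "No Coupon" is a hypothetical value.
sample = ["ELEC10", "SALE20", "No Coupon"]
counts = category_counts(sample)
print(counts)
```

The two-letter codes map to the names used in the tables: Lu is Uppercase Letter, Ll is Lowercase Letter, Nd is Decimal Number and Zs is Space Separator.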

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ELEC10
2nd row: ELEC10
3rd row: ELEC10
4th row: ELEC10
5th row: ELEC10

Most frequent values

Value              Count  Frequency (%)
SALE20             6373   12.0%
SALE30             5915   11.2%
SALE10             5838   11.0%
ELEC10             4826   9.1%
ELEC30             4647   8.8%
ELEC20             4540   8.6%
EXTRA10            2317   4.4%
OFF10              2250   4.3%
EXTRA20            2211   4.2%
OFF20              2202   4.2%
Other values (36)  11805  22.3%

Histogram of lengths of the category

Most frequent words (lowercased)

Value              Count  Frequency (%)
sale20             6373   12.0%
sale30             5915   11.1%
sale10             5838   10.9%
elec10             4826   9.1%
elec30             4647   8.7%
elec20             4540   8.5%
extra10            2317   4.3%
off10              2250   4.2%
extra20            2211   4.1%
off20              2202   4.1%
Other values (37)  12205  22.9%

Most occurring characters

Value              Count  Frequency (%)
E                  56250  18.1%
0                  52524  16.9%
L                  32139  10.4%
A                  27948  9.0%
S                  18126  5.8%
2                  17830  5.7%
1                  17470  5.6%
3                  17224  5.6%
C                  15357  5.0%
F                  13026  4.2%
Other values (20)  42337  13.6%

Most occurring categories

Value             Count   Frequency (%)
Uppercase Letter  202383  65.2%
Decimal Number    105048  33.9%
Lowercase Letter  2400    0.8%
Space Separator   400     0.1%

Most frequent character per category

Uppercase Letter

Value              Count  Frequency (%)
E                  56250  27.8%
L                  32139  15.9%
A                  27948  13.8%
S                  18126  9.0%
C                  15357  7.6%
F                  13026  6.4%
O                  8517   4.2%
R                  7346   3.6%
T                  6843   3.4%
X                  6575   3.2%
Other values (11)  10256  5.1%

Decimal Number

Value  Count  Frequency (%)
0      52524  50.0%
2      17830  17.0%
1      17470  16.6%
3      17224  16.4%

Lowercase Letter

Value  Count  Frequency (%)
o      1200   50.0%
u      400    16.7%
p      400    16.7%
n      400    16.7%

Space Separator

Value    Count  Frequency (%)
(space)  400    100.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   204783  66.0%
Common  105448  34.0%

Most frequent character per script

Latin

Value              Count  Frequency (%)
E                  56250  27.5%
L                  32139  15.7%
A                  27948  13.6%
S                  18126  8.9%
C                  15357  7.5%
F                  13026  6.4%
O                  8517   4.2%
R                  7346   3.6%
T                  6843   3.3%
X                  6575   3.2%
Other values (15)  12656  6.2%

Common

Value    Count  Frequency (%)
0        52524  49.8%
2        17830  16.9%
1        17470  16.6%
3        17224  16.3%
(space)  400    0.4%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  310231  100.0%

Most frequent character per block

Value              Count  Frequency (%)
E                  56250  18.1%
0                  52524  16.9%
L                  32139  10.4%
A                  27948  9.0%
S                  18126  5.8%
2                  17830  5.7%
1                  17470  5.6%
3                  17224  5.6%
C                  15357  5.0%
F                  13026  4.2%
Other values (20)  42337  13.6%

Discount_pct
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB

Value  Count
20.0   17830
10.0   17470
30.0   17224
0.0    400

Length

Max length: 4
Median length: 4
Mean length: 3.992441992
Min length: 3

Characters and Unicode

Total characters: 211296
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10.0
2nd row: 10.0
3rd row: 10.0
4th row: 10.0
5th row: 10.0

Value  Count  Frequency (%)
20.0   17830  33.7%
10.0   17470  33.0%
30.0   17224  32.5%
0.0    400    0.8%

Histogram of lengths of the category

Value  Count  Frequency (%)
20.0   17830  33.7%
10.0   17470  33.0%
30.0   17224  32.5%
0.0    400    0.8%

Most occurring characters

Value  Count   Frequency (%)
0      105848  50.1%
.      52924   25.0%
2      17830   8.4%
1      17470   8.3%
3      17224   8.2%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     158372  75.0%
Other Punctuation  52924   25.0%

Most frequent character per category

Decimal Number

Value  Count   Frequency (%)
0      105848  66.8%
2      17830   11.3%
1      17470   11.0%
3      17224   10.9%

Other Punctuation

Value  Count  Frequency (%)
.      52924  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  211296  100.0%

Most frequent character per script

Value  Count   Frequency (%)
0      105848  50.1%
.      52924   25.0%
2      17830   8.4%
1      17470   8.3%
3      17224   8.2%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  211296  100.0%

Most frequent character per block

Value  Count   Frequency (%)
0      105848  50.1%
.      52924   25.0%
2      17830   8.4%
1      17470   8.3%
3      17224   8.2%

Marketing_Date

Distinct: 365
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB
Minimum: 2019-01-01 00:00:00
Maximum: 2019-12-31 00:00:00
Histogram with fixed size bins (bins=50)

Offline_Spend
Real number (ℝ≥0)

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2830.914141
Minimum: 500
Maximum: 5000
Zeros: 0
Zeros (%): 0.0%
Memory size: 826.9 KiB

Quantile statistics

Minimum: 500
5-th percentile: 1000
Q1: 2500
median: 3000
Q3: 3500
95-th percentile: 4500
Maximum: 5000
Range: 4500
Interquartile range (IQR): 1000

Descriptive statistics

Standard deviation: 936.1542467
Coefficient of variation (CV): 0.3306897349
Kurtosis: 0.1019985876
Mean: 2830.914141
Median Absolute Deviation (MAD): 500
Skewness: -0.3168074538
Sum: 149823300
Variance: 876384.7736
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Most frequent values

Value  Count  Frequency (%)
3000   13529  25.6%
2500   10333  19.5%
3500   9008   17.0%
2000   6019   11.4%
4000   4882   9.2%
1500   2411   4.6%
4500   2115   4.0%
1000   1840   3.5%
500    985    1.9%
700    969    1.8%

Smallest 5 values

Value  Count  Frequency (%)
500    985    1.9%
700    969    1.8%
1000   1840   3.5%
1500   2411   4.6%
2000   6019   11.4%

Largest 5 values

Value  Count  Frequency (%)
5000   833    1.6%
4500   2115   4.0%
4000   4882   9.2%
3500   9008   17.0%
3000   13529  25.6%

Online_Spend
Real number (ℝ≥0)

Distinct: 365
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1893.109119
Minimum: 320.25
Maximum: 4556.93
Zeros: 0
Zeros (%): 0.0%
Memory size: 826.9 KiB

Quantile statistics

Minimum: 320.25
5-th percentile: 682.42
Q1: 1252.63
median: 1837.87
Q3: 2425.35
95-th percentile: 3396.14
Maximum: 4556.93
Range: 4236.68
Interquartile range (IQR): 1172.72

Descriptive statistics

Standard deviation: 807.014092
Coefficient of variation (CV): 0.4262903199
Kurtosis: -0.1476468586
Mean: 1893.109119
Median Absolute Deviation (MAD): 587.1
Skewness: 0.4543058812
Sum: 100190907
Variance: 651271.7446
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value               Count  Frequency (%)
2819.58             335    0.6%
2489.36             311    0.6%
1692.8              298    0.6%
2155.96             292    0.6%
985.28              291    0.5%
2563.83             289    0.5%
1292.58             278    0.5%
663.46              274    0.5%
1331.1              269    0.5%
1172.96             264    0.5%
Other values (355)  50023  94.5%

Smallest 5 values

Value   Count  Frequency (%)
320.25  130    0.2%
417.73  185    0.3%
465.4   43     0.1%
478.27  131    0.2%
484.9   160    0.3%

Largest 5 values

Value    Count  Frequency (%)
4556.93  141    0.3%
4349.02  124    0.2%
4055.3   189    0.4%
4019.93  61     0.1%
3897.2   129    0.2%

revenue
Real number (ℝ≥0)

SKEWED

Distinct: 9831
Distinct (%): 18.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 196.8032289
Minimum: 4.12
Maximum: 240587.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 826.9 KiB

Quantile statistics

Minimum: 4.12
5-th percentile: 8.8
Q1: 19.59
median: 51.583
Q3: 155
95-th percentile: 427.5
Maximum: 240587.5
Range: 240583.38
Interquartile range (IQR): 135.41

Descriptive statistics

Standard deviation: 1747.303145
Coefficient of variation (CV): 8.878427223
Kurtosis: 7732.727063
Mean: 196.8032289
Median Absolute Deviation (MAD): 40.703
Skewness: 70.05685108
Sum: 10415614.09
Variance: 3053068.28
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value                Count  Frequency (%)
125                  1499   2.8%
155                  1282   2.4%
125.5                640    1.2%
19.59                522    1.0%
250                  507    1.0%
155.5                485    0.9%
85                   380    0.7%
16.63                360    0.7%
21.99                330    0.6%
129.27               314    0.6%
Other values (9821)  46605  88.1%

Smallest 5 values

Value  Count  Frequency (%)
4.12   1      < 0.1%
4.185  1      < 0.1%
4.557  7      < 0.1%
4.65   1      < 0.1%
4.753  2      < 0.1%

Largest 5 values

Value      Count  Frequency (%)
240587.5   1      < 0.1%
115686     1      < 0.1%
109452.83  1      < 0.1%
78582.08   1      < 0.1%
65250      1      < 0.1%
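
The SKEWED flag on revenue reflects a very long right tail: a handful of transactions are orders of magnitude larger than the median. As a minimal sketch of what a skewness coefficient measures, the function below computes the moment-based (population) form, m3 / m2^1.5. This is an assumption for illustration; the report's own value may come from a bias-adjusted estimator, so the two need not match exactly on small samples.

```python
def skewness(values):
    """Population skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 1, 100]   # one large outlier, like a rare huge order

print(skewness(symmetric), skewness(right_tailed))
```

A symmetric sample gives zero, while a single large outlier pushes the coefficient well above zero, which is the pattern the report flags for revenue.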

revenue_per_customer
Real number (ℝ≥0)

Distinct: 1464
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22894.63686
Minimum: 6.3
Maximum: 306357.752
Zeros: 0
Zeros (%): 0.0%
Memory size: 826.9 KiB

Quantile statistics

Minimum: 6.3
5-th percentile: 957.723
Q1: 3955.713
median: 8725.835
Q3: 19589.133
95-th percentile: 113573.419
Maximum: 306357.752
Range: 306351.452
Interquartile range (IQR): 15633.42

Descriptive statistics

Standard deviation: 41812.87602
Coefficient of variation (CV): 1.826317503
Kurtosis: 14.84566505
Mean: 22894.63686
Median Absolute Deviation (MAD): 5938.954
Skewness: 3.63964285
Sum: 1211675761
Variance: 1748316601
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value                Count  Frequency (%)
244509.716           695    1.3%
106406.408           587    1.1%
118511.709           575    1.1%
113573.419           572    1.1%
134516.474           523    1.0%
36400.311            366    0.7%
53675.804            315    0.6%
60381.669            297    0.6%
25601.119            290    0.5%
36940.336            261    0.5%
Other values (1454)  48443  91.5%

Smallest 5 values

Value  Count  Frequency (%)
6.3    1      < 0.1%
7.192  1      < 0.1%
7.2    1      < 0.1%
7.74   1      < 0.1%
8.09   1      < 0.1%

Largest 5 values

Value       Count  Frequency (%)
306357.752  19     < 0.1%
257792.206  147    0.3%
245611.693  157    0.3%
244509.716  695    1.3%
137971.419  163    0.3%

revenue_seg
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 826.9 KiB

Value         Count
High_value    18009
Low_value     17489
Medium_value  17426

Length

Max length: 12
Median length: 10
Mean length: 10.32807422
Min length: 9

Characters and Unicode

Total characters: 546603
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: High_value
2nd row: High_value
3rd row: High_value
4th row: High_value
5th row: High_value

Value         Count  Frequency (%)
High_value    18009  34.0%
Low_value     17489  33.0%
Medium_value  17426  32.9%

Histogram of lengths of the category

Value         Count  Frequency (%)
high_value    18009  34.0%
low_value     17489  33.0%
medium_value  17426  32.9%

Most occurring characters

Value             Count   Frequency (%)
u                 70350   12.9%
e                 70350   12.9%
_                 52924   9.7%
v                 52924   9.7%
a                 52924   9.7%
l                 52924   9.7%
i                 35435   6.5%
H                 18009   3.3%
g                 18009   3.3%
h                 18009   3.3%
Other values (6)  104745  19.2%

Most occurring categories

Value                  Count   Frequency (%)
Lowercase Letter       440755  80.6%
Uppercase Letter       52924   9.7%
Connector Punctuation  52924   9.7%

Most frequent character per category

Lowercase Letter

Value             Count  Frequency (%)
u                 70350  16.0%
e                 70350  16.0%
v                 52924  12.0%
a                 52924  12.0%
l                 52924  12.0%
i                 35435  8.0%
g                 18009  4.1%
h                 18009  4.1%
o                 17489  4.0%
w                 17489  4.0%
Other values (2)  34852  7.9%

Uppercase Letter

Value  Count  Frequency (%)
H      18009  34.0%
L      17489  33.0%
M      17426  32.9%

Connector Punctuation

Value  Count  Frequency (%)
_      52924  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   493679  90.3%
Common  52924   9.7%

Most frequent character per script

Latin

Value             Count  Frequency (%)
u                 70350  14.3%
e                 70350  14.3%
v                 52924  10.7%
a                 52924  10.7%
l                 52924  10.7%
i                 35435  7.2%
H                 18009  3.6%
g                 18009  3.6%
h                 18009  3.6%
L                 17489  3.5%
Other values (5)  87256  17.7%

Common

Value  Count  Frequency (%)
_      52924  100.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  546603  100.0%

Most frequent character per block

Value             Count   Frequency (%)
u                 70350   12.9%
e                 70350   12.9%
_                 52924   9.7%
v                 52924   9.7%
a                 52924   9.7%
l                 52924   9.7%
i                 35435   6.5%
H                 18009   3.3%
g                 18009   3.3%
h                 18009   3.3%
Other values (6)  104745  19.2%

Interactions

[Figures: pairwise interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
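
As a minimal sketch of that definition, the function below divides the covariance by the product of the standard deviations, using only the Python standard library. The name pearson_r and the sample lists are illustrative, and this is not necessarily how the report itself computes the value.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 6]
r = pearson_r(xs, ys)
# Shifting and rescaling xs leaves r unchanged (location/scale invariance).
r_rescaled = pearson_r([3 * x + 5 for x in xs], ys)
print(r, r_rescaled)
```

The population (divide by n) form is used for both the covariance and the standard deviations; the sample (n - 1) form gives the same r because the factors cancel.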

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
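
A small sketch of that recipe: convert each variable to ranks, then apply the covariance-over-standard-deviations formula to the ranks. Tied values are given their average rank here, which is a common convention but an assumption about this report. The function names are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in ry) / n)
    return cov / (sx * sy)

# A monotonic but nonlinear relationship (y = x cubed).
rho = spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
print(rho)
```

Because the relationship is strictly increasing, the ranks agree perfectly even though the raw values are far from a straight line.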

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
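
The pair-counting definition can be sketched directly. The version below is the simple form often called tau-a, which divides by the total number of pairs exactly as described; statistical libraries commonly use a tie-adjusted variant instead, so results can differ when the data contain ties. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total pairs, with no tie adjustment."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(xs)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs

tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

In the three-point example, two pairs are concordant and one is discordant, so τ is (2 - 1) / 3.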

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
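
For orientation, here is a sketch of a bias-corrected Cramér's V in the form commonly attributed to Bergsma (2013): the chi-squared based φ² and the table dimensions are each shrunk by a term in 1/(n - 1) before taking the ratio. Whether this matches the report's internal implementation term for term is an assumption, and the function name is hypothetical.

```python
import math
from collections import Counter

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0   # degenerate table (a variable with a single level)
    return math.sqrt(phi2_corr / denom)

perfect = cramers_v_corrected(["a"] * 50 + ["b"] * 50, ["x"] * 50 + ["y"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(perfect, independent)
```

The two calls show the extremes: identical groupings give 1, and a perfectly balanced, unrelated pairing gives 0.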

Missing values

Count: a simple visualization of nullity by column.
Matrix: the nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index) | CustomerID | Transaction_ID | Transaction_Date | Product_SKU | Product_Description | Product_Category | Quantity | Avg_Price | Delivery_Charges | Coupon_Status | Gender | Location | Tenure_Months | GST | Transaction_Date_Month_x | Transaction_Date_Month_y | User_type | year_number | week_number | month_number | Coupon_Code | Discount_pct | Marketing_Date | Offline_Spend | Online_Spend | revenue | revenue_per_customer | revenue_seg
0 | 17850 | 16679 | 2019-01-01 | GGOENEBJ079499 | Nest Learning Thermostat 3rd Gen-USA - Stainless Steel | Nest-USA | 1 | 153.71 | 6.50 | Used | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 144.189 | 60381.669 | High_value
1 | 17850 | 16680 | 2019-01-01 | GGOENEBJ079499 | Nest Learning Thermostat 3rd Gen-USA - Stainless Steel | Nest-USA | 1 | 153.71 | 6.50 | Used | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 144.189 | 60381.669 | High_value
2 | 17850 | 16696 | 2019-01-01 | GGOENEBQ078999 | Nest Cam Outdoor Security Camera - USA | Nest-USA | 2 | 122.77 | 6.50 | Not Used | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 258.540 | 60381.669 | High_value
3 | 17850 | 16699 | 2019-01-01 | GGOENEBQ079099 | Nest Protect Smoke + CO White Battery Alarm-USA | Nest-USA | 1 | 81.50 | 6.50 | Clicked | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 88.000 | 60381.669 | High_value
4 | 17850 | 16700 | 2019-01-01 | GGOENEBJ079499 | Nest Learning Thermostat 3rd Gen-USA - Stainless Steel | Nest-USA | 1 | 153.71 | 6.50 | Clicked | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 160.210 | 60381.669 | High_value
5 | 17850 | 16701 | 2019-01-01 | GGOENEBJ079499 | Nest Learning Thermostat 3rd Gen-USA - Stainless Steel | Nest-USA | 1 | 153.71 | 6.50 | Clicked | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 160.210 | 60381.669 | High_value
6 | 17850 | 16702 | 2019-01-01 | GGOENEBJ079499 | Nest Learning Thermostat 3rd Gen-USA - Stainless Steel | Nest-USA | 2 | 153.71 | 6.50 | Clicked | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 320.420 | 60381.669 | High_value
7 | 17850 | 16703 | 2019-01-01 | GGOENEBQ079099 | Nest Protect Smoke + CO White Battery Alarm-USA | Nest-USA | 2 | 81.50 | 6.50 | Not Used | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 176.000 | 60381.669 | High_value
8 | 17850 | 16704 | 2019-01-01 | GGOENEBJ079499 | Nest Learning Thermostat 3rd Gen-USA - Stainless Steel | Nest-USA | 1 | 256.88 | 6.50 | Used | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 237.042 | 60381.669 | High_value
9 | 17850 | 16710 | 2019-01-01 | GGOENEBJ079499 | Nest Learning Thermostat 3rd Gen-USA - Stainless Steel | Nest-USA | 1 | 153.71 | 28.78 | Clicked | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 182.490 | 60381.669 | High_value

Last rows

(index) | CustomerID | Transaction_ID | Transaction_Date | Product_SKU | Product_Description | Product_Category | Quantity | Avg_Price | Delivery_Charges | Coupon_Status | Gender | Location | Tenure_Months | GST | Transaction_Date_Month_x | Transaction_Date_Month_y | User_type | year_number | week_number | month_number | Coupon_Code | Discount_pct | Marketing_Date | Offline_Spend | Online_Spend | revenue | revenue_per_customer | revenue_seg
52914 | 14608 | 27753 | 2019-05-11 | GGOEWEBB082699 | Waze Mobile Phone Vent Mount | Waze | 1 | 5.59 | 6.0 | Clicked | F | California | 48 | 0.18 | 2019-05 | 2019-05 | New | 2019 | 19 | 5 | WEMP20 | 20.0 | 2019-05-11 | 3000 | 1801.66 | 11.590 | 11.590 | Low_value
52915 | 14866 | 24996 | 2019-04-07 | GGOEGHPJ080110 | Google 5-Panel Cap | Headgear | 1 | 15.19 | 6.5 | Not Used | F | Chicago | 43 | 0.05 | 2019-04 | 2019-04 | New | 2019 | 14 | 4 | HGEAR10 | 10.0 | 2019-04-07 | 2500 | 2719.46 | 21.690 | 21.690 | Low_value
52916 | 13029 | 32468 | 2019-07-12 | GGOEAHPA004110 | Android Wool Heather Cap Heather/Black | Headgear | 1 | 9.99 | 6.0 | Clicked | F | New York | 35 | 0.05 | 2019-07 | 2019-07 | New | 2019 | 28 | 7 | HGEAR10 | 10.0 | 2019-07-12 | 2500 | 923.40 | 15.990 | 15.990 | Low_value
52917 | 12503 | 33847 | 2019-07-27 | GGOEGHPJ080310 | Google Blackout Cap | Headgear | 1 | 10.63 | 6.0 | Clicked | F | Chicago | 36 | 0.05 | 2019-07 | 2019-07 | New | 2019 | 30 | 7 | HGEAR10 | 10.0 | 2019-07-27 | 2500 | 1151.70 | 16.630 | 16.630 | Low_value
52918 | 15797 | 40896 | 2019-10-17 | GGOENEBQ086499 | Nest Cam IQ - USA | Nest | 1 | 199.00 | 6.0 | Used | F | California | 27 | 0.05 | 2019-10 | 2019-10 | New | 2019 | 42 | 10 | NE10 | 10.0 | 2019-10-17 | 2500 | 1783.56 | 184.500 | 184.500 | Low_value
52919 | 12990 | 46842 | 2019-12-14 | GGOENEBQ086799 | Nest Thermostat E - USA | Nest | 1 | 100.91 | 6.5 | Clicked | F | California | 47 | 0.05 | 2019-12 | 2019-12 | New | 2019 | 50 | 12 | NE30 | 30.0 | 2019-12-14 | 4000 | 3434.31 | 107.410 | 473.878 | Low_value
52920 | 12990 | 46843 | 2019-12-14 | GGOENEBQ092299 | Nest Secure Alarm System Starter Pack - USA | Nest | 1 | 355.74 | 6.5 | Used | F | California | 47 | 0.05 | 2019-12 | 2019-12 | New | 2019 | 50 | 12 | NE30 | 30.0 | 2019-12-14 | 4000 | 3434.31 | 253.568 | 473.878 | Low_value
52921 | 12990 | 46843 | 2019-12-14 | GGOENEBQ093499 | Nest Detect - USA | Nest | 2 | 49.95 | 6.5 | Clicked | F | California | 47 | 0.05 | 2019-12 | 2019-12 | New | 2019 | 50 | 12 | NE30 | 30.0 | 2019-12-14 | 4000 | 3434.31 | 112.900 | 473.878 | Low_value
52922 | 16333 | 47144 | 2019-12-16 | GGOENEBQ092299 | Nest Secure Alarm System Starter Pack - USA | Nest | 1 | 355.74 | 6.5 | Clicked | F | New York | 41 | 0.05 | 2019-12 | 2019-12 | New | 2019 | 51 | 12 | NE30 | 30.0 | 2019-12-16 | 4000 | 3116.98 | 362.240 | 644.490 | Low_value
52923 | 16333 | 47144 | 2019-12-16 | GGOENEBQ093499 | Nest Detect - USA | Nest | 5 | 49.95 | 6.5 | Not Used | F | New York | 41 | 0.05 | 2019-12 | 2019-12 | New | 2019 | 51 | 12 | NE30 | 30.0 | 2019-12-16 | 4000 | 3116.98 | 282.250 | 644.490 | Low_value